ABSTRACT

Given the basic parameters of a cosmic shear weak lensing survey, how well can systematic 
errors due to anisotropy in the point spread function (PSF) be corrected? The largest source 
of error in this correction to date has been the interpolation of the PSF to the locations of 
the galaxies. To address this error, we separate the PSF patterns into components that recur in 
multiple exposures/pointings and those that vary randomly between different exposures (such as 
those due to the atmosphere). In an earlier study we developed a principal component approach
to correct the recurring PSF patterns (Jarvis & Jain 2004). In this paper we show how randomly 
varying PSF patterns can also be circumvented in the measurement of shear correlations. For 
the two-point correlation function this is done by simply using pairs of galaxy shapes measured 
in different exposures. Combining the two techniques allows us to tackle generic combinations 
of PSF anisotropy patterns. The second goal of this paper is to give a formalism for quantifying 
residual systematic errors due to PSF patterns. We show how the main PSF corrections improve 
with increasing survey area (and thus can stay below the reduced statistical errors), and we 
identify the residual errors which do not scale with survey area. Our formalism can be applied 
both to planned lensing surveys to optimize survey strategy and to actual lensing data to quantify 
residual errors. 

Subject headings: cosmology: gravitational lensing — methods: data analysis



1. Introduction 



Weak gravitational lensing refers to the coherent distortions of background galaxy images by mass 
structures along the line of sight. Lensing measurements from imaging surveys have emerged as a powerful 
probe of cosmology. With planned surveys that will cover thousands of square degrees, the statistical errors 
on measured shear correlations will be extremely small (e.g. the Dark Energy Survey (Abbott et al. 2005), 
PanSTARRS (Kaiser 2004), LSST (Starr et al. 2002), and SNAP (Lampton et al. 2002)). However system- 
atic errors may exceed the statistical errors and dominate the error budget on cosmological parameters. 

To analyze the effect of errors on lensing statistics, we will consider the two-point correlation functions
of the shear, $\xi_+(\theta)$ and $\xi_-(\theta)$, and the shear power spectrum $C_\gamma(\ell)$. Other statistics often used in lensing
measurements, such as the aperture mass variance and the top-hat shear variance, can be obtained by integrating
the two-point correlation functions (Schneider et al. 2002), so there is no need to consider them in addition.

Fig. 1. — Statistical errors in the lensing power spectrum for three sets of survey parameters. The upper black curve shows
the shear power spectrum for source galaxies at z = 1. The three lower curves show the statistical errors (the sum
of the sample variance and intrinsic ellipticity contributions) for the three sets of survey parameters described in the text. Since the
PSF anisotropy can be of order 1%, its contribution ($\sim 10^{-4}$) would greatly exceed the statistical errors. It must be
corrected by several orders of magnitude to fall below the statistical errors for these survey parameters.



The contribution to the shear correlations per logarithmic interval in $\ell$ is $\ell^2 C_\gamma(\ell)/2\pi$, which makes it an intuitive
quantity to plot. Figure 1 shows this power spectrum for the (nonlinearly evolved)
concordance $\Lambda$CDM model, and the statistical errors on it for three choices of survey parameters. The
statistical errors include sample variance, which dominates on large scales ($\ell$ below a few thousand), and the
shot noise contribution of the intrinsic ellipticities of galaxies. The systematic errors should be smaller than
the sum of these so as not to dominate the error budget.

The upper black curve shows the shear power spectrum for source galaxies at redshift z = 1. The
lowest statistical errors (long-dashed blue curve) are for a survey similar to that planned for the LSST,
which covers half the sky: $f_{\rm sky} = 0.5$, $n_g = 40$, with an rms intrinsic contribution to the shear of 0.25.
The dashed red curve is for a ground-based survey with smaller sky coverage: $f_{\rm sky} = 0.1$, $n_g = 40$. The
dotted green curve shows the statistical error for a space-based survey with $f_{\rm sky} = 0.1$, $n_g = 100$. The
higher number density reduces the error at high $\ell$ compared to the ground-based survey with the same sky
coverage. For $\ell > 10^4$, on sub-arcminute scales, the statistical error rises due to the intrinsic ellipticity
contribution, which has a white noise power spectrum (assuming the intrinsic ellipticities are uncorrelated
and randomly oriented). However, the statistical errors are roughly constant over the range of scales that
provide the cosmological information ($100 < \ell < 10^4$). This is useful for setting the permissible level of
residual systematics.
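
As a rough illustration of how the three sets of survey parameters compare, the following Python sketch evaluates the standard Gaussian-field expression for the statistical error on a band-averaged power spectrum, $\Delta C(\ell) = [2/((2\ell+1)\,\Delta\ell\,f_{\rm sky})]^{1/2}\,[C(\ell) + \sigma^2/n_g]$. Several things here are assumptions made only for the example and are not taken from the calculation behind Figure 1: the use of this expression, the reading of $n_g$ as a number density per square arcminute, the use of $\sigma^2$ (rather than $\sigma^2/2$) as the shot-noise amplitude, the fractional band width, and the placeholder value passed for $C(\ell)$.

```python
import math

# Conversion factor: square arcminutes in one steradian.
ARCMIN2_PER_STERADIAN = (180.0 * 60.0 / math.pi) ** 2

# (f_sky, n_g) for the three cases quoted in the text; n_g is ASSUMED to be
# a number density per square arcminute.
SURVEYS = {
    "LSST-like": (0.5, 40.0),
    "ground": (0.1, 40.0),
    "space": (0.1, 100.0),
}


def shot_noise_power(sigma, n_per_arcmin2):
    """White-noise power from intrinsic ellipticities: sigma^2 / n (n per steradian)."""
    return sigma ** 2 / (n_per_arcmin2 * ARCMIN2_PER_STERADIAN)


def band_error(ell, delta_ell, f_sky, c_ell, sigma, n_per_arcmin2):
    """Gaussian statistical error on C(ell) averaged over a band of width delta_ell."""
    n_modes = (2.0 * ell + 1.0) * delta_ell * f_sky
    total_power = c_ell + shot_noise_power(sigma, n_per_arcmin2)
    return math.sqrt(2.0 / n_modes) * total_power


def compare(ell, c_ell, sigma=0.25, frac_width=0.1):
    """Statistical error at multipole ell for each survey (band width = frac_width * ell)."""
    return {
        name: band_error(ell, frac_width * ell, f_sky, c_ell, sigma, n_g)
        for name, (f_sky, n_g) in SURVEYS.items()
    }


if __name__ == "__main__":
    # c_ell is a hypothetical placeholder, not a model prediction.
    for name, err in compare(ell=1000.0, c_ell=1e-9).items():
        print(name, err)
```

Under these assumptions the ordering of the three error levels follows directly from the sky coverage (number of modes) and the number density (shot noise); the absolute values depend on the placeholder signal and should not be read as survey forecasts.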

One of the main sources of error in the shear estimates comes from the convolution of the image by the 
point spread function (PSF). This function is known (albeit noisily) at the positions of the stars in the image. 
As the PSF varies across the image, one must interpolate this function to the positions of the galaxies. 
An incorrect model of the PSF leads to an error in the estimated (pre-seeing) galaxy shape and hence in 
the shear correlation. Coherent PSF patterns have non-zero two-point functions, which add to the lensing 
induced correlations in galaxy ellipticities. This systematic error can exceed statistical errors in lensing
measurements if the PSF is not modeled sufficiently accurately (Hoekstra 2004). We will not consider here 
errors due to removal of the PSF from the galaxy shapes if the PSF at the location of the galaxy is known 
correctly (e.g. Kaiser, Squires, & Broadhurst 1995; Kaiser 2000; Bernstein & Jarvis 2002; Refregier & 
Bacon 2003). We are only concerned here with the estimation of the PSF at each galaxy's location. 

PSF interpolation error has been one of the primary sources of systematic error in most of the lensing
measurements published to date (errors in the shear calibration and redshift distribution are the other main
sources). Given a model for PSF anisotropy we can calculate how well the power spectrum would need
to be corrected to be well below the statistical errors. The statistical error curves in Figure 1 give a good
indication of the upper limit on coherent residual systematic errors if they are not to dominate the error
budget. Thus at $\ell \sim 1000$, or 10 arcminute scales, the coherent residual should be well below 0.001 (so
that its square is smaller than the statistical error curves). Generic models of PSF patterns do not exist; the
amplitudes measured in current data (before any corrections) are in the range 1-10% with varying coherence
scales. Telescopes that will be built with lensing as a primary science goal are expected to do better than
these, and may have PSF modeling software like TinyTim for HST, but even for the best-designed telescope,
the galaxy shapes will require correction using data on stars.

In this paper we describe two methods which in combination can remove the systematic effects of 
asymmetric PSFs in large imaging surveys. A method based on a principal component analysis (PCA) of 
the PSF was the subject of a recent paper (Jarvis & Jain 2004). Essentially, it detects and models components 
of the PSF pattern which appear in many different images. For example, guiding errors have the same effect
on every star in an exposure, so the resulting pattern is a constant in (x, y), with a coefficient which varies from
exposure to exposure. The principal component corresponding to this is therefore a constant. Focus errors
are similarly recurring; astigmatism produces a characteristic pattern when the telescope is slightly above
focus, and the opposite pattern when below. That is, there is a fixed (x, y) pattern which is modulated
by a coefficient for each exposure. (There may be more than one principal component corresponding to
focus if the variation is not quite linear as the telescope gets more out of focus.) In general, the principal
components should model any pattern due to a recurrent physical cause. The second method discussed in
this paper tackles PSF patterns that do not recur in different exposures.

In §2, we outline the pipeline for lensing measurements to show where different sources of error enter. 
We quantify the residual systematic errors due to the PSF pattern after performing the PCA interpolation 
and show how they scale with survey area. We find that some components of the error scale as Area$^{-1}$.
However, other components of the error remain roughly constant even with this interpolation scheme. We 
discuss how these components are ones that do not recur across different exposures. 

In §3, we show how to completely eliminate these non-recurring systematic errors by correlating only
galaxy shapes measured on distinct exposures. Since the two- and three-point correlation functions encom- 
pass most of the lensing information that will be desired from current and future surveys, the combination of 
these two techniques will lead to the near elimination of systematic errors stemming from PSF interpolation. 

In §4 we discuss the key requirements for a survey to keep residual systematics sufficiently small. We 
list the ingredients that determine these residuals and describe how to estimate them for planned surveys as 
well as from actual survey data. Future work needed to test and refine this approach is discussed. 

2. Systematic Errors Due to PSF Interpolation 

2.1. The Lensing Pipeline 

We begin by outlining the pipeline used to estimate lensing statistics from images of the sky. This will 
help us identify the steps at which different errors enter and how new techniques can reduce certain errors. 
We will introduce the PCA technique below in step 3 on PSF interpolation and use it in subsequent sections 
to quantify residuals. 

The lensing pipeline can be summarized in 5 main steps: 

1. Detection of Stars and Galaxies 

Weak lensing surveys generally observe the same portion of the sky on several separate exposures. 
Each of these exposures is usually made up of multiple images, from the multiple CCD chips in the
camera. These factors can make object detection and measurement somewhat complicated. Generally,
one wants to stack all of the images for a given part of the sky to get the best signal-to-noise for
detecting objects. However, if the shapes are measured from the stacked image, there are issues due 
to correlated noise from the image-combining algorithm, and even slight registration errors can lead 
to very significant errors in the shape measurements. Also, the PSF on the stacked image will be near 
the middle of the range of seeing values, so the signal-to-noise for the smallest galaxies may actually 
be worse on the stacked image than on the best-seeing images. Worse, the PSF pattern on the stacked 
image will change abruptly at the edge of every input exposure, so if there are large offsets in the 
original pointings, the PSF pattern will be impossible to model precisely. 

Therefore, we generally recommend detecting objects on a stacked image, but measuring the PSF and 
galaxy shapes on the images from the individual exposures. This will allow for good PSF interpolation
and keep the pixel noise uncorrelated. If the image registration is good enough, one can
centroid on the stacked image and use it for the individual measurements, which may improve the 
signal-to-noise of the shape estimates. 

In §3 we point out that using shapes from different exposures eliminates systematic errors due to
certain PSF patterns; this also argues against stacking images to measure shapes. It will impact the
optimal number of images to observe per location, which we discuss in §3.1.

For surveys with very many (more than about 10 or 20) exposures at each location, it may make sense 
to stack together several exposures which have roughly the same seeing and which are not (very) 
offset from each other. Each location would then have a smaller number of images for measuring 
shapes. For the purposes of this paper, the term exposure would then refer to these stacked images, 
rather than the original exposures. 

Finally, we remind the reader that the PSF can act as a matched filter for galaxies that are aligned in
the same direction as it. This selection bias can introduce a systematic error. The error is eliminated by
detecting galaxies which are as faint as possible, and then selecting according to a shape-independent
signal-to-noise estimate. A similar error, which is more difficult to remove, is that galaxies are more 
likely to be blended along the direction of the PSF, which will bias galaxy shapes in the same direction. 

2. Measurements of the PSF 

After identifying stars and galaxies in an image, the stars are used to measure various aspects of the 
PSF. Different analysis methods use different components of the PSF, but all methods measure the 
ellipticity and size at least. More sophisticated analyses require some higher order shape information 
as well. Since the PSF generally varies between different exposures as well as across the field, these 
values are measured as a function of the positions $\phi^{(i)}$ of stars in exposure $i$.

There are three systematic errors which may be introduced at this stage. First, small galaxies may be
falsely identified as stars, which will lead to errors in the PSF estimates. Second, if the PSF is color
dependent, the PSF measured by the stars may not be (exactly) the same as the PSF which has acted
on the galaxies. This color error may be redshift dependent which would complicate tomography 
analyses. Similarly, if the detector response is slightly non-linear, the PSF of the bright stars may be 
different from the PSF of the faint galaxies. 

3. Interpolation of the PSF 

We need to know the PSF at the location of the galaxies, which are the tracers of the lensing shear. 
Since the PSF is not measured at these locations, we need to interpolate the measurements from the 
locations of the stars. Here, we briefly describe how to do this using the principal component analysis 
(PCA) method (described in greater detail in Jarvis & Jain 2004): 

For each exposure, $i$, we find a polynomial fit to the PSF measurements, $e_i^{(m)}(\phi^{(i)})$, where the order
$m$ is given by the number of stars available for fitting, $(m+1)(m+2)/2 < N_*^{(i)}$, and $\phi^{(i)}$ is the
position measured in the coordinate system of exposure $i$. In practice, one may want to use a separate
polynomial for each chip to avoid smoothing over discontinuities at the chip boundaries.

Next we take the patterns for all of the exposures and pointings, as quantified by the coefficients in
the polynomials $e_i$, and find the principal components of the variation. This will find the patterns
that repeat over a significant number of the exposures. We sort the principal components according to
how much they contribute to the total variation of the PSF patterns (i.e. the singular values). At some
point the components will not be important for describing the patterns, so we choose some cutoff and
only use the first $N_{\rm PC}$ components. (We will discuss how to determine this number below.) Thus, the
PSF pattern for each exposure is described as a weighted sum of the various principal components:

$$ e_i(\phi^{(i)}) = \sum_{k=1}^{N_{\rm PC}} a_{ik} P_k(\phi^{(i)}) \qquad (1) $$

where $i$ is the index number of the exposure, $e$ can represent the PSF ellipticity, or any other feature
of the PSF, such as size or any higher order shape information, and $P_k$ is the $k$th principal component
for the same quantity.

We can then refine these principal components, $P_k$, using a higher order polynomial, by keeping the
$a_{ik}$ coefficients fixed and using the stars in all of the exposures for the fit. For very large surveys, this
allows us to use significantly higher order polynomials which more accurately describe the components.

4. Measurements of the galaxy shapes 

Given the relevant description of the PSF at the location of a galaxy, one can make an estimate of the 
galaxy's shape before convolution by the PSF. Our methods for doing this are described in Bernstein 
& Jarvis (2002), but there are other methods for this step as well. For the purposes of this paper, we 
will assume that the only errors introduced here are the measurement noise and the intrinsic shape 
noise of the galaxy. That is, if the knowledge of the PSF were perfect, we assume that this step would 
then produce perfectly unbiased estimates of the shear at each galaxy's location. 

In reality, there may be systematic errors due to the dilution correction (the effect of the size of the PSF
on galaxy shapes; cf. Hirata & Seljak 2003) or the shear calibration (the response of the distribution
of galaxy ellipticities to the shear). These errors lead to multiplicative errors in the shear two-point
correlations. We discuss in §4 how the improved PSF interpolation we describe would reduce the
dilution errors as well. Recent studies (Huterer et al. 2005; Guzik & Bernstein 2005) show that the
impact of these errors on cosmological parameter estimation is less severe than the additive errors due
to PSF anisotropy, as they can be self-calibrated from the data.

5. Correlation of the shear estimates and comparison to theory 

The lensing shear information is contained primarily in the two- and three-point shear correlation 
functions. In fact, most other shear statistics, such as the shear variance and the aperture mass variance 
and skewness, can be expressed as integrals over these functions. Exceptions include the convergence 
probability distribution function (Zhang & Pen 2005), peak statistics (Jain & van Waerbeke 2000; 
Miyazaki et al. 2002) and topology measures (Matsubara & Jain 2001; Sato et al. 2001, 2003), which 
contain more information about the shear than is contained in its low-order correlations. 



There are two two-point shear correlation functions:

$$ \xi_+(\theta) = \langle \gamma(\phi)\,\gamma^*(\phi+\theta) \rangle \qquad (2) $$

$$ \xi_-(\theta) = \langle \gamma(\phi)\,\gamma(\phi+\theta) \rangle \qquad (3) $$

where * indicates the complex conjugate of the complex-valued shear estimate, and the shears are measured
relative to the line joining the two galaxies. Likewise, there are four three-point shear correlation
functions which are a function of the size and shape of the triangle connecting the three galaxies.

Since other statistics may be derived from these, we take the correlation functions to be the final prod- 
uct of the lensing pipeline which is then used to constrain cosmology. For tomography applications, 
the correlation functions are measured as functions of redshift bins as well as the angular separation. 
The following discussion of the errors would then refer to the errors in the correlation functions for 
each pair (or triplet) of redshift bins. 

The estimation of cosmological parameters from the measured shear correlations relies on the use 
of redshift information. Errors in the estimated redshift bins (or the overall distribution of redshifts 
for non-tomographic applications) are an important systematic error which may be introduced at this
point. Accurately calibrating the redshifts is a big concern for upcoming large cosmic shear surveys 
(Ma, Hu & Huterer 2005; Huterer et al. 2005). Spectroscopic sub-samples that extend to high redshifts 
may be necessary to calibrate the redshift distribution, though Mandelbaum et al. (2005) show how to 
some extent it can be calibrated from the data. 

There may also be systematic errors introduced by the theoretical predictions. In particular, current 
estimates of the non-linear power spectrum may have errors of order 5% at scales of several arcminutes 
(Smith et al. 2003), or even larger for quintessence models (Klypin 2003). (See Linder & White 
2005 for an improved prescription for generic dark energy cosmologies.) On small scales it is also 
necessary to consider baryonic physics (White 2004; Zhan & Knox 2004) and higher order effects, 
e.g. to account for the fact that galaxy shape measurements estimate the reduced shear $g = \gamma/(1-\kappa)$,
not the shear $\gamma$ directly (White 2005; Dodelson et al. 2005). Theoretical predictions which do not
correctly take this into account would introduce an error on small scales. 
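
The PCA interpolation of step 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of Jarvis & Jain (2004): it assumes NumPy, fits one scalar PSF quantity per exposure with an unweighted least-squares polynomial, applies no mean subtraction before the decomposition, and obtains the principal components from a singular value decomposition of the matrix of polynomial coefficients. The function names `poly_basis`, `fit_exposure` and `principal_components` are hypothetical.

```python
import numpy as np


def poly_basis(x, y, order):
    """2-D polynomial terms x^p y^q with p + q <= order; shape (n_points, n_terms)."""
    cols = [x ** p * y ** q
            for p in range(order + 1)
            for q in range(order + 1 - p)]
    return np.stack(cols, axis=1)


def fit_exposure(x, y, e, order):
    """Unweighted least-squares polynomial coefficients for one PSF quantity e."""
    coeffs, *_ = np.linalg.lstsq(poly_basis(x, y, order), e, rcond=None)
    return coeffs


def principal_components(coeff_matrix, n_pc):
    """Decompose the per-exposure coefficients (rows = exposures).

    Returns the weights a[i, k], the components P[k] (as polynomial
    coefficients) and the full list of singular values.  No mean pattern
    is removed first; that choice is left open in this sketch.
    """
    u, s, vt = np.linalg.svd(coeff_matrix, full_matrices=False)
    a = u[:, :n_pc] * s[:n_pc]
    return a, vt[:n_pc], s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    order, n_exp = 2, 50
    n_terms = (order + 1) * (order + 2) // 2
    # Synthetic test case: two recurring patterns with random amplitudes.
    patterns = rng.normal(size=(2, n_terms))
    amplitudes = rng.normal(size=(n_exp, 2))
    rows = []
    for i in range(n_exp):
        x, y = rng.uniform(-1, 1, 200), rng.uniform(-1, 1, 200)
        e = poly_basis(x, y, order) @ (amplitudes[i] @ patterns)
        rows.append(fit_exposure(x, y, e, order))
    a, P, s = principal_components(np.array(rows), n_pc=2)
    print("singular values:", s)
```

In the synthetic case only two singular values are significantly non-zero, which is the signature one would look for when choosing the cutoff $N_{\rm PC}$ in real data.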

We have seen that there are systematic errors which may be introduced in every step of the lensing 
pipeline. The errors from the PSF interpolation step have often been considered the most difficult to remove 
due to the limited number of stars per exposure. The most problematic of the other errors are the shear 
calibration and the redshift calibration. There is ongoing work aimed at limiting the impact of these errors 
on cosmological parameter estimation from future lensing data. However, they are not the focus of this 
paper; henceforth we restrict our discussion to the PSF interpolation errors. 
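
For concreteness, the two-point functions of step 5 can be estimated from a catalog of complex shears as in the following Python sketch. It is a brute-force, unweighted, flat-sky estimator for a single separation bin, written only to make the definitions of $\xi_+$ and $\xi_-$ explicit; the function name `xi_plus_minus`, the rotation convention (a factor $e^{-2i\alpha}$, with $\alpha$ the position angle of the line joining the pair) and the use of NumPy are assumptions of the example.

```python
import numpy as np


def xi_plus_minus(x, y, gamma, r_min, r_max):
    """Unweighted xi_+ and xi_- from pairs with separation in [r_min, r_max).

    x, y  : flat-sky positions (1-D arrays)
    gamma : complex shear estimates gamma_1 + i gamma_2 in a fixed frame
    If no pair falls in the bin the result is NaN.
    """
    i, j = np.triu_indices(len(x), k=1)          # every pair once
    dx, dy = x[j] - x[i], y[j] - y[i]
    r = np.hypot(dx, dy)
    sel = (r >= r_min) & (r < r_max)
    alpha = np.arctan2(dy[sel], dx[sel])         # angle of the joining line
    rot = np.exp(-2j * alpha)                    # rotate shears to that line
    g1 = gamma[i[sel]] * rot
    g2 = gamma[j[sel]] * rot
    xi_p = np.mean(g1 * np.conj(g2)).real        # <gamma gamma*>
    xi_m = np.mean(g1 * g2).real                 # <gamma gamma>
    return xi_p, xi_m


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.uniform(0, 10, 300), rng.uniform(0, 10, 300)
    gamma = np.full(300, 0.01 + 0.0j)            # constant additive pattern
    print(xi_plus_minus(x, y, gamma, 1.0, 2.0))
```

A spatially constant additive pattern of amplitude 0.01, like the guiding-error example in §1, contributes $10^{-4}$ to $\xi_+$ at every separation, which is the level of contamination discussed in connection with Figure 1.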

2.2. Residual Systematic Errors after PCA Interpolation

We now want to determine what the residual systematic errors in the estimates of the correlation functions
are, due to imperfect PSF interpolation. Let the estimated shear in exposure $i$ at position $\phi^{(i)}$ be

$$ \hat\gamma(\phi^{(i)}) = \gamma_{\rm grav}(\phi^{(i)}) + e_{\rm intrinsic}(\phi^{(i)}) + e_{\rm meas}(\phi^{(i)}) + e_{\rm PSF}(\phi^{(i)}) \qquad (4) $$

where $\gamma_{\rm grav}$ is the lensing signal, $e_{\rm intrinsic}$ is the noise due to the intrinsic shape of the galaxies, $e_{\rm meas}$ is the
statistical error in the measurement from the photon shot noise, and $e_{\rm PSF}$ is the error in the shear estimate
due to uncorrected PSF contamination.

The systematic errors from the PSF interpolation enter through the term $e_{\rm PSF}$, which arises from errors
in the PSF ellipticity. Using equation 1, which expands it in principal components, we can write $e_{\rm PSF}$ as:

$$ e_{\rm PSF}(\phi^{(i)}) = \mathcal{R} \left[ \sum_{k \le N_{\rm PC}} \delta\!\left( a_{ik} P_k(\phi^{(i)}) \right) + \sum_{k > N_{\rm PC}} a_{ik} P_k(\phi^{(i)}) \right] \qquad (5) $$

where $\delta$ refers to the error in the estimate of a quantity, and the last sum includes all of the patterns which
are not modeled by the PCA, including any completely random effects which do not recur in multiple
exposures. $\mathcal{R}$ represents the conversion from ellipticity to shear, which Bernstein & Jarvis (2002) refer
to as responsivity^1. When using this technique for other properties of the PSF besides ellipticity (size for
example), $\mathcal{R}$ would be the corresponding mean effect that errors in the measurement have on the net shear
estimates from the galaxy shapes.

We will refer to the estimates of the two-point correlation function from observations of galaxies on
two exposures, $i$ and $j$, as:

$$ \hat\xi^{(i,j)}(\theta) = \langle \hat\gamma(\phi^{(i)})\,\hat\gamma(\phi^{(j)}+\theta) \rangle \qquad (6) $$

where we omit the $+$ and $-$ subscripts, both here and in much of the further discussion, leaving the
appropriate conjugation or not in the two cases implied.

The statistical errors from the measurement noise and intrinsic ellipticities are well understood. We
now look at what can contribute to the systematic PSF contamination, $e_{\rm PSF}$, and how that propagates to
$\hat\xi^{(i,j)}$. The errors in the three-point function are completely analogous, so it is sufficient to only refer to the
two-point function here.



1. Errors in the principal components, $P_k$

There will be errors in the estimates of the $P_k$ functions due to the simple fact that we constrain them
with a finite number of stars. These lead to systematic errors in the correlation functions, since we use
the same principal components for all of the exposures, so the errors in the $P_k$ repeat for every pair of
galaxies that is used to calculate the correlation function.



^1 More generally, $\mathcal{R}$ is the net effect on the shear estimate due to an error in the PSF ellipticity, which may include other effects
than that described by the $\mathcal{R}$ of Bernstein & Jarvis (2002).
If the error in each estimated $P_k$ is $\delta P_k$, then the propagated error in the correlation function due to
this is (from Equations 5 and 6):

$$ \delta\xi^{(i,j)}_{(1)} = \mathcal{R}^2 \sum_{k,\ell \le N_{\rm PC}} a_{ik} a_{j\ell} \langle \delta P_k(\phi^{(i)})\,\delta P_\ell(\phi^{(j)}+\theta) \rangle \qquad (7) $$

2. Unmeasured principal components

The above analysis only included a finite number of principal components, $N_{\rm PC}$. Any PSF variation
that is described by components of lower significance than these has been left completely unmodeled.
Therefore, all of this PSF power will still be uncorrected in the galaxy shapes, which will lead to a
systematic error in the shear estimates:

$$ \delta\xi^{(i,j)}_{(2)} = \mathcal{R}^2 \sum_{k > N_{\rm PC}} \left( 2 \sum_{\ell \le N_{\rm PC}} a_{ik} a_{j\ell} \langle P_k(\phi^{(i)})\,\delta P_\ell(\phi^{(j)}+\theta) \rangle + \sum_{\ell > N_{\rm PC}} a_{ik} a_{j\ell} \langle P_k(\phi^{(i)})\,P_\ell(\phi^{(j)}+\theta) \rangle \right) \qquad (8) $$

$$ \approx \mathcal{R}^2 \sum_{k > N_{\rm PC}} a_{ik} a_{jk} \langle P_k(\phi^{(i)})\,P_k(\phi^{(j)}+\theta) \rangle \qquad (9) $$

where $a_{ik}$ refers to the values that the coefficients for the unmeasured components would have if
they were included in the analysis. The dominant terms in this expression will typically be the
autocorrelation terms as written in Equation 9; however, it is possible that there could be significant
correlations with either other unmeasured components or the errors in the measured components as
shown in Equation 8.

3. Non-recurring contributions to the PSF pattern 

Some portion of the PSF pattern is completely random and uncorrelated between different exposures, 
e.g. atmospheric effects. These non-recurring contributions will remain as a systematic error in the 
shear correlations since they can have spatial structure. Wittman (2005) has recently measured the 
atmospheric contribution to PSF errors from the Subaru telescope, while Kaiser, Tonry, & Luppino 
(2000) modeled its spatial and temporal coherence. The actual level of atmospheric contribution will 
depend on details of the instrument and observing strategy. Here we will consider the atmosphere as 
well as non-recurring contributions from the instrument in one category of PSF errors. 

These could be viewed as a subset of the unmeasured principal components described above, since the
information here must be contained in a complete principal component analysis with $N_{\rm PC}$ equal to the
total number of exposures. These uncorrelated contributions would be described by the myriad very
low significance components which constitute most of those neglected by using a (much) lower $N_{\rm PC}$.
However, we choose to make them a separate item to point out that these contributions are completely
uncorrelated from one exposure to another, which makes them a qualitatively different type of error. In
particular, this means that $\langle a_{ik} a_{jk} \rangle = 0$ for $i \neq j$, so the systematic error only occurs for estimates of
$\hat\xi^{(i,j)}$ with $i = j$:

$$ \delta\xi^{(i,i)}_{(3)} = \mathcal{R}^2 \langle P^{(i)}_{\rm atm}(\phi^{(i)})\,P^{(i)}_{\rm atm}(\phi^{(i)}+\theta) \rangle \qquad (10) $$

where $P^{(i)}_{\rm atm}$ is the portion of the PSF pattern which is uncorrelated with that from any other exposure.

4. Errors in the coefficients $a_{ik}$

There will be errors in the estimates of the $a_{ik}$ coefficients as well, since they will be constrained by
the finite number of stars in each exposure. These errors lead to systematic errors in the correlation
function, since the amount of correction on the galaxies for each principal component will be slightly
wrong. This will then add a little bit of the correlation functions of the components to the shear
correlation function:

$$ \delta\xi^{(i,i)}_{(4)} = \mathcal{R}^2 \sum_{k,\ell \le N_{\rm PC}} {\rm Cov}(a_{ik}, a_{i\ell}) \langle P_k(\phi^{(i)})\,P_\ell(\phi^{(i)}+\theta) \rangle \qquad (11) $$

Note that, like the previous systematic, this systematic is nonzero only for estimates of $\hat\xi^{(i,j)}$ with $i = j$,
since the errors on the coefficients are uncorrelated between exposures.
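
The distinction drawn in items 3 and 4, that errors uncorrelated between exposures contaminate only the $i = j$ estimates, can be checked with a toy calculation. The Python sketch below is a one-dimensional illustration under stated assumptions: each exposure is given an independent, spatially coherent random pattern (white noise smoothed with a Gaussian kernel, a stand-in chosen for the example and not a model of the atmosphere), and the pattern correlation at a fixed lag is compared for same-exposure and different-exposure pairs. All function names are hypothetical.

```python
import numpy as np


def smooth_random_pattern(rng, n_pix, width):
    """Spatially coherent random pattern: white noise smoothed by a Gaussian kernel.

    width is the kernel scale in pixels (an integer); the result is scaled
    to unit standard deviation.
    """
    k = np.arange(-3 * width, 3 * width + 1)
    kernel = np.exp(-0.5 * (k / width) ** 2)
    noise = rng.normal(size=n_pix + len(k) - 1)
    field = np.convolve(noise, kernel, mode="valid")   # length n_pix
    return field / field.std()


def lag_correlation(a, b, lag):
    """<a(phi) b(phi + lag)> averaged over positions phi."""
    return np.mean(a[:len(a) - lag] * b[lag:])


def auto_and_cross(patterns, lag):
    """Mean correlation for same-exposure (i = j) and different-exposure (i != j) pairs."""
    n = len(patterns)
    auto = np.mean([lag_correlation(p, p, lag) for p in patterns])
    cross = np.mean([lag_correlation(patterns[i], patterns[j], lag)
                     for i in range(n) for j in range(n) if i != j])
    return auto, cross


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    patterns = [smooth_random_pattern(rng, 2000, 5) for _ in range(20)]
    auto, cross = auto_and_cross(patterns, lag=3)
    print("same exposure:", auto, " different exposures:", cross)
```

In this toy case the same-exposure correlation stays close to the pattern variance, while the different-exposure correlation is consistent with zero up to sampling noise.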



2.3. Scaling of Systematics with Survey Parameters 

Surveys with degree sized fields of view (FOV), covering total area of 1000 square degrees or larger, 
are likely to have enough exposures and stars to make accurate corrections to PSF anisotropy. Here we
quantify the residual PSF systematics described above and discuss their possible impact on survey strategy. 

The following parameters constitute our description of a lensing survey. 

Field of view, in steradians: $\Omega_{\rm FOV}$

Number of pointings: $N_{\rm point}$

Number of exposures per pointing: $N_{\rm exp}$

Mean number of stars per exposure: $\langle N_*^{(i)} \rangle = N_* = n_* \Omega_{\rm FOV}$

Number of significant principal components: $N_{\rm PC}$

The survey size is given by the number of pointings as $\Omega_s = N_{\rm point} \Omega_{\rm FOV}$. For PSF measurement, $N_*$
includes only those stars that have well measured shapes, and which are robustly identified as stars.
Interloping small galaxies are an additional concern if one tries to push the stellar locus too close to the galaxy
locus of the size-magnitude diagram. The number of significant principal components used to describe the
recurring PSF pattern, $N_{\rm PC}$, will likely not be known in advance of the data. In fact, it is still variable after
obtaining data; we describe how to determine a good value for it in §2.3.1.

We will make the simplifying and conservative assumption that all exposures in a given part of the sky
are centered on the same point, so they do not sample the PSF on different parts of the camera^2. Hence the
total number of PSF measurements is $N_* N_{\rm point}$. These are used to measure the $N_{\rm PC}$ principal components.
The maximum order of the polynomial that can be used for each principal component is then roughly
$(N_* N_{\rm point}/N_{\rm PC})^{1/2}$, which can be much larger than the order possible by using just the stars in a single
exposure.
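
To make the gain in polynomial order concrete, the short Python sketch below applies the counting condition $(m+1)(m+2)/2 < N$ from §2.1, first to the stars in a single exposure and then to the number of stars available per principal component across the survey. The helper names and the example numbers (100 stars per exposure, 1000 pointings, 20 components) are hypothetical and chosen only for illustration.

```python
def n_poly_terms(order):
    """Number of coefficients in a 2-D polynomial of the given order."""
    return (order + 1) * (order + 2) // 2


def max_poly_order(n_stars):
    """Largest order m with (m+1)(m+2)/2 < n_stars; -1 if even m = 0 is not allowed."""
    m = -1
    while n_poly_terms(m + 1) < n_stars:
        m += 1
    return m


if __name__ == "__main__":
    n_star, n_point, n_pc = 100, 1000, 20      # hypothetical survey numbers
    print("single exposure:", max_poly_order(n_star))
    print("whole survey, per component:", max_poly_order(n_star * n_point // n_pc))
```

With these example numbers the single-exposure fit is limited to order 12, while the survey-wide fit to each principal component could in principle go to order 98, of the same order as $(N_* N_{\rm point}/N_{\rm PC})^{1/2} \approx 71$.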



2.3.1. Scaling of Errors 1 and 2 

The magnitude of the errors in the principal components scales according to the total number of stars
used to constrain each component. Since all of the stars in the survey need to jointly constrain $N_{\rm PC}$ components,
we have

$$ \delta P_k \propto \left( \frac{N_{\rm PC}}{N_* N_{\rm point}} \right)^{1/2} \qquad (12) $$

The systematic error $\delta\xi_{(1)}$ will then scale as

$$ \delta\xi_{(1)} \propto N_{\rm PC} (\delta P_k)^2 \qquad (13) $$

since each element in the sum for $\delta\xi_{(1)}$ in Equation 7 is quadratic in the $\delta P_k$ functions (assuming that the
cross-terms involving different principal components are negligible). As survey size increases (increasing
$N_{\rm point}$), this systematic error will decrease even faster than the statistical errors in the shear, which decrease
as $N_{\rm point}^{-1/2}$, presuming that the PC patterns do not evolve as the survey progresses.

The error due to the neglected principal components, $\delta\xi_{(2)}$, will decrease if we increase $N_{\rm PC}$,
since the largest neglected components will then have smaller rms amplitude. However, increasing $N_{\rm PC}$
will increase the previous error, since each principal component will be less well measured. In an ideal
analysis, the number of principal components would be set so that the sum of the systematic errors from these
two factors is stationary in the number of components. That is, the improvement due to adding an additional
component should exactly offset the loss due to the other components being slightly less well measured. In
general, it is not easy to determine at what $N_{\rm PC}$ this will happen. One needs to look at some measure of
the contamination as a function of $N_{\rm PC}$ to find the minimum total contamination. We discuss a few such
measures in §4.

^2 If this is not true, and the exposures are offset from each other, then the total number of PSF measurements increases by a
factor of $N_{\rm exp}$, since the stars in every exposure give a new sample of the PSF patterns. If the exposures have the same pointing,
then the extra exposures per location do not provide additional constraints on the PSF pattern.

If such a procedure is done, then the amplitude of the first neglected component will scale approximately as

$$ |P_{N_{\rm PC}+1}| \propto \left( \frac{N_{\rm PC}}{N_* N_{\rm point}} \right)^{1/2} \qquad (14) $$

Furthermore, in our CTIO survey data, we have found that the large-$k$ asymptotic behavior of $|P_k|$ is

$$ |P_k(\phi)| \propto e^{-\alpha k} \qquad (15) $$

with $\alpha$ of order 0.02. Therefore, we can estimate $\delta\xi_{(2)}$:

$$ \delta\xi_{(2)} \propto \sum_{k > N_{\rm PC}} |P_k|^2 \sim \frac{1}{\alpha} |P_{N_{\rm PC}+1}|^2 \qquad (16) $$

$$ \propto \frac{N_{\rm PC}}{\alpha N_* N_{\rm point}} \qquad (17) $$

Assuming the asymptotic behavior is relatively generic and that the necessary increase in $N_{\rm PC}$ occurs significantly
more slowly than the increase in $N_{\rm point}$, both systematics would scale as $N_{\rm point}^{-1}$.

However, we should point out that one is also limited by the constraint that $N_{\rm pc} < N_*$; otherwise
the coefficients $a_{ik}$ cannot be measured. So if too many principal components become important, it will
eventually become impossible to include all of them, and $\delta\xi^{(2)}$ will not scale as $N_{\rm point}^{-1}$ any further. Thus,
it is important in designing a survey to try to minimize the number of sources of PSF variation to keep the
number of principal components reasonably low.
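The trade-off in choosing the number of principal components can be illustrated with a toy model. The sketch below assumes an exponentially decaying component spectrum, $|P_k|^2 \propto e^{-2\alpha k}$, for the neglected components and a purely hypothetical measurement-error term that grows linearly with the number of fitted components and falls as $1/(N_* N_{\rm point})$. The function names, the `noise` normalization and the linear form of the measurement term are illustrative assumptions, not quantities taken from the analysis above.

```python
import math


def neglected_tail(n_pc, alpha):
    """Closed-form sum of exp(-2*alpha*k) over all k > n_pc.

    This is the geometric tail of a toy |P_k|^2 spectrum; for small alpha
    it is close to exp(-2*alpha*(n_pc + 1)) / (2*alpha).
    """
    return math.exp(-2.0 * alpha * (n_pc + 1)) / (1.0 - math.exp(-2.0 * alpha))


def measurement_error(n_pc, n_star, n_point, noise):
    """Hypothetical error from imperfectly measured components (linear in n_pc)."""
    return noise * n_pc / (n_star * n_point)


def best_n_pc(alpha, n_star, n_point, noise):
    """Brute-force the n_pc < n_star that minimises the toy total contamination."""
    def total(n_pc):
        return (neglected_tail(n_pc, alpha)
                + measurement_error(n_pc, n_star, n_point, noise))
    return min(range(1, n_star), key=total)


if __name__ == "__main__":
    # The preferred number of components grows slowly with survey size.
    for n_point in (10, 100, 1000):
        print(n_point, best_n_pc(alpha=0.02, n_star=200, n_point=n_point, noise=1000.0))
```

In this toy model the preferred number of components grows only logarithmically with the number of pointings, and the search is capped below the number of stars per exposure, in line with the constraint discussed above.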



2.3.2. Scaling of Errors 3 and 4 

The error from the atmosphere's PSF, $\delta\xi^{(3)}$, is essentially constant with survey size. Some of the
atmosphere's PSF pattern will be modeled by the various principal components, but most of the PSF power 
will remain as a systematic error, especially the high order power, which will almost always be completely 
different from the high order power of the PC's. In particular, the atmosphere's pattern on scales smaller than 
the stellar separation cannot be modeled by the PCA or any other method. The magnitude of this contribution 
seems to be relatively small for current surveys. But for upcoming larger surveys, its contribution may 
become dominant over the residuals from the coherent patterns due to the telescope. Space-based surveys 
will not have this contribution, although it is possible that they will have other sources of PSF patterns which 
are uncorrelated between exposures. 

The errors in the coefficients $a_{ik}$ do not scale with the number of pointings, since they are constrained
only by the stars in a single exposure. For each exposure, $i$, there are $N_*$ stars which are used to constrain
$N_{\rm pc}$ coefficients. Take the stars in an exposure to be numbered $m = 1..N_*$, with positions $\phi_m$, shapes $e_m$,
and shape uncertainties $\sigma_m$. Also, define $A$ to be the vector of coefficients for that exposure ($A_k = a_{ik}$),
define a vector $E$ with $E_m = e_m/\sigma_m$, and define a matrix $Q$ with $Q_{m,k} = P_k(\phi_m)/\sigma_m$. Then the
least-squares solution for $A$ is:

$$A = (Q^T Q)^{-1} (Q^T E) \tag{18}$$

$${\rm Cov}(A) = (Q^T Q)^{-1} \tag{19}$$

If $N_* < N_{\rm pc}$, then $Q^T Q$ is singular and the errors on $A$ are infinite. If $N_* = N_{\rm pc}$, then there is some
combination of coefficients whose error is proportional to the largest $\sigma_m$. If we sort the stars by $\sigma_m$, so that
the largest $\sigma_m$ is at $m = N_*$, then

$$\sigma(\hat{a}) = K \sigma_{N_*} \tag{20}$$

where $\hat{a}$ is the least well measured linear combination of coefficients and $K$ is a constant which depends on
the values of $P_k(\phi_m)$. $K$ is generally of order the rms value of $P_k(\phi_m)$, but it can be arbitrarily larger for
unfortunate sampling of the principal components^. Finally, if $N_* > N_{\rm pc}$ (the usual case), then it can be
shown that

$$\sigma(\hat{a}) > K' \left( \sum_{m \ge N_{\rm pc}} \sigma_m^{-2} \right)^{-1/2} \tag{21}$$

where the sum is over the $(N_* - N_{\rm pc} + 1)$ least well-measured stars^. Since $\sigma_m \propto 1/\nu$, where $\nu$ is the
signal-to-noise of the star, the sum in the above formula is dominated by the highest signal-to-noise stars.
Thus the overall error scales roughly as the shape error of the $N_{\rm pc}$-th brightest star. The fainter stars do not
help very much. Since $\delta\xi^{(4)}$ contains a sum over the elements of ${\rm Cov}(A)$, it scales similarly.
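The weighted least-squares fit of the coefficients in a single exposure can be sketched numerically as follows. This is a minimal illustration that treats the stellar shape as a single real ellipticity component and assumes the principal-component values at the star positions are already tabulated; the function name and argument layout are hypothetical.

```python
import numpy as np


def fit_pc_coefficients(P, e, sigma):
    """Weighted least-squares fit of principal-component coefficients.

    P     : (n_star, n_pc) array, component k evaluated at star m
    e     : (n_star,) array of measured stellar shapes (one real component)
    sigma : (n_star,) array of shape uncertainties
    Returns the coefficient vector A and its covariance (Q^T Q)^{-1}.
    """
    Q = P / sigma[:, None]          # each row is divided by that star's uncertainty
    E = e / sigma
    cov = np.linalg.inv(Q.T @ Q)    # singular if n_star < n_pc
    A = cov @ (Q.T @ E)
    return A, cov


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    P = rng.normal(size=(20, 3))
    sigma = rng.uniform(0.01, 0.1, size=20)
    A_true = np.array([0.5, -0.2, 0.1])
    A, cov = fit_pc_coefficients(P, P @ A_true, sigma)
    print(A, np.sqrt(np.diag(cov)))
```

For larger or poorly conditioned problems one would normally prefer `np.linalg.lstsq` over an explicit inverse; the explicit form is used here only because it mirrors the covariance expression directly.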



3. Multi-exposure Correlation Functions 

In the preceding section, we found two contributions to the systematic error in the correlation functions
which do not scale with the survey size. However, notice that both of these, $\delta\xi^{(3)}$ and $\delta\xi^{(4)}$, only exist for
estimates of the correlation function which use shear estimates from the same exposure ($i = j$).

There is a simple way to eliminate such systematic errors in the correlation function: for each pair of
galaxies used in estimating the two-point function, use galaxy shapes measured from different exposures. In
other words, only use pairs with $i \ne j$. Since the atmospheric component of $e_{\rm PSF}$ and that from the errors
in the coefficients are uncorrelated between exposures $i$ and $j$, there is no systematic bias in the estimates
of the correlation function. We have thus used the fact that the atmosphere has spatial coherence in any
given exposure, but decorrelates rapidly between distinct exposures (as long as they are not taken in
immediate succession). The same holds for some types of instrumental systematics which are not correlated
between exposures taken on different nights. The systematic errors eliminated by this technique are what
we have been calling $\delta\xi^{(3)}$ and $\delta\xi^{(4)}$. So for these two components of the error, $\delta\xi = 0$.

^For example, if all of the stars happen to be where some component has very little power, then they will not be able to constrain
the coefficient of this component very well.

^The proof of this expression is somewhat technical, but we direct the interested reader to Golub & van Loan (1996), p. 443.
The derivation of our formula is based on their proof of Theorem 8.5.3 regarding singular values of a diagonal matrix plus a rank-1
matrix.
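The effect of restricting the estimator to pairs with $i \ne j$ can be seen in a small numerical sketch. Here an additive term that is common to both galaxies within an exposure, but independent between exposures, stands in for the atmospheric residual; the array layout, the function names and the Gaussian form of the contaminant are illustrative assumptions only.

```python
import numpy as np


def cross_exposure_products(e1, e2):
    """Mean of e1[i] * e2[j] over exposures i != j, for each galaxy pair.

    e1, e2 : (n_pairs, n_exp) arrays with the per-exposure shapes of the
             two galaxies in each pair (one real component).
    """
    n_exp = e1.shape[1]
    all_terms = e1.sum(axis=1) * e2.sum(axis=1)   # sum over every (i, j)
    same_exposure = (e1 * e2).sum(axis=1)         # the i == j terms
    return (all_terms - same_exposure) / (n_exp * (n_exp - 1))


def same_exposure_products(e1, e2):
    """Mean of e1[i] * e2[i] over exposures, for each galaxy pair."""
    return (e1 * e2).mean(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Pure contaminant, no lensing signal: both galaxies share the same
    # per-exposure term, which is independent from exposure to exposure.
    c = rng.normal(0.0, 0.05, size=(50_000, 5))
    print("same exposure :", same_exposure_products(c, c).mean())   # biased by <c^2>
    print("cross exposure:", cross_exposure_products(c, c).mean())  # consistent with zero
```

The same-exposure estimate picks up the variance of the contaminant as a bias, while the cross-exposure estimate averages to zero, at the cost of discarding the $i = j$ products.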

For the three-point (or $n$-point) correlation function, the same argument holds as long as we have at least
three (or $n$) exposures. With the shapes all taken from different exposures, the systematic errors from the
atmosphere and the coefficient estimates are eliminated, leaving only the errors from the PC measurements
and the neglected PCs to contribute to the systematic error.

By $\delta\xi$, we have been referring to a systematic error, or bias; that is, a change in the expectation value
relative to the correct value. So correlating across multiple exposures results in no systematic error from the
effects we have numbered 3 and 4. However, these errors (all four, actually) also contribute to the statistical
error in $\xi$, since the variance of $\xi$ due to these two errors does not vanish. This contribution to the statistical
error has yet to be accurately estimated, but we expect it to be small compared to the sample variance plus
intrinsic ellipticity errors ($\sigma_{\rm intrinsic}$ from Equation 4).

Wittman (2005) used a set of exposures of a single field imaged with the Subaru telescope to estimate 
the atmospheric contribution. This is a concern on arcminute scales or smaller, for which the PSF correction 
may not be accurate even with PCA interpolation if the PSF pattern is non-recurrent. Wittman (2005) finds 
a contribution to the shear correlation of order 10^^ on arcminute scales. This may be compared to the 
contribution from intrinsic ellipticities, which is of order 10^^, but scales inversely with the total number 
of galaxy pairs. The atmospheric contribution scales inversely with the number of independent coherent 
patches, which depends on the coherence scale of the atmosphere. Unless this scale is much larger than 
an arcminute, the atmospheric contribution will be comparatively small. The contribution of the other three 
errors to the statistical error budget is also likely to be small, but it needs to be estimated for planned surveys. 



3.1. Optimal Number of Exposures per Pointing 

How many exposures should one take per pointing? We have advocated multiple exposures in the 
discussion above to be able to use galaxies in different exposures to measure shear correlations. While this 
eliminates certain systematic errors, by omitting the i = j terms in the correlation function estimates, we 
are losing some information. 

Assume that the two galaxies in each pair used for the two-point correlation function are each
observed on $N_{\rm exp}$ exposures and have a measurement error, $\sigma_{\rm meas}$, on each exposure equal to $\sigma\sqrt{N_{\rm exp}}$ (so
the measurement error on a stacked image would be $\sigma$). The variance of $\xi$ when the $i = j$ pairs are neglected
is found to be:

For well measured galaxies, the last term is negligible (assuming $N_{\rm exp} > 2$); however, for faint galaxies, it
becomes important. One generally limits one's measurements to galaxies with $\sigma < \sigma_\gamma \approx 0.25$, since the
measurements of fainter galaxies will often be unstable. For $\sigma \approx \sigma_\gamma$, we see that the fractional increase in
the noise from omitting the $i = j$ pairs is $N_{\rm exp}/4(N_{\rm exp} - 1)$, which is somewhat significant for only 2
exposures, but is small for 5 exposures.
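The size of the information loss can be explored with a simple Monte Carlo sketch. It assumes a toy noise model: no lensing signal, independent Gaussian intrinsic shapes of rms $\sigma_\gamma$ for the two galaxies, and independent Gaussian measurement noise of rms $\sigma\sqrt{N_{\rm exp}}$ on every exposure. Under these assumptions the per-pair variance of the product is $(\sigma_\gamma^2 + \sigma^2)^2$ when all exposure pairs are used and $(\sigma_\gamma^2 + \sigma^2)^2 + \sigma^4/(N_{\rm exp} - 1)$ when the $i = j$ products are dropped. The function name and default values are illustrative, and the toy model is not meant to reproduce the expression referred to in the text.

```python
import numpy as np


def variance_ratio(n_exp, sigma_gamma=0.25, sigma=0.25, n_pairs=200_000, seed=0):
    """Variance of the i != j estimator relative to the all-pairs estimator.

    Toy model: zero signal, intrinsic shape rms sigma_gamma per galaxy,
    per-exposure measurement noise of rms sigma * sqrt(n_exp).
    """
    rng = np.random.default_rng(seed)
    per_exposure = sigma * np.sqrt(n_exp)
    e1 = rng.normal(0.0, sigma_gamma, (n_pairs, 1)) + rng.normal(0.0, per_exposure, (n_pairs, n_exp))
    e2 = rng.normal(0.0, sigma_gamma, (n_pairs, 1)) + rng.normal(0.0, per_exposure, (n_pairs, n_exp))

    all_pairs = e1.mean(axis=1) * e2.mean(axis=1)           # equivalent to stacking
    cross = (e1.sum(axis=1) * e2.sum(axis=1) - (e1 * e2).sum(axis=1)) / (n_exp * (n_exp - 1))
    return cross.var() / all_pairs.var()


if __name__ == "__main__":
    for n_exp in (2, 3, 5, 10):
        expected = 1.0 + 1.0 / (4.0 * (n_exp - 1))          # toy-model value for sigma == sigma_gamma
        print(n_exp, round(variance_ratio(n_exp), 3), round(expected, 3))
```

In this toy model the penalty is largest for two exposures and falls off quickly as more exposures are added, which is the qualitative behavior described above.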

If the shape uncertainties vary significantly between exposures, then our approximation that each shear
error is $\sigma\sqrt{N_{\rm exp}}$ would be incorrect. A more careful analysis in this case suggests using enough exposures
so that there are at least 2 or 3 with "good" measurements of the shapes. For typical variations in the seeing
quality, 5 exposures is probably still sufficient.

On the other hand, with large $N_{\rm exp}$, the measurement of the shapes on each exposure becomes harder
relative to the measurement on a stacked image. If the signal-to-noise on an individual image drops to near
unity, the measurements may fail to provide any kind of useful value. Even signal-to-noise values of 5 or
10 often create problems. So to avoid having to discard many galaxies which would be measurable on a
stacked image, we definitely want to limit $N_{\rm exp}$ to at most 10 or so.

Therefore, we suggest a minimum of 5 and a maximum of about 10 exposures per pointing for measurements
of shear correlations^. For surveys planning to take very many exposures per pointing, one would
want to stack subsets of the exposures into 5-10 stacked images and treat these sub-stacks as the exposures to
which we have been referring. Sorting the original images by seeing radius before stacking would probably
be the best strategy in order to not wash out the best-seeing images for at least 2 or 3 of the sub-stacks.
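The sub-stacking strategy can be sketched as a simple grouping step. The helper below is hypothetical: it only decides which exposures go into which sub-stack, by sorting on a per-exposure seeing value and cutting the sorted list into nearly equal groups, and leaves the actual image stacking to whatever tool is in use.

```python
def group_by_seeing(seeing, n_stacks):
    """Split exposure indices into n_stacks groups of similar seeing.

    seeing   : list of per-exposure seeing values (e.g. FWHM in arcsec)
    n_stacks : number of sub-stacks to form (5-10 in the discussion above)
    Returns a list of index lists, ordered from best to worst seeing.
    """
    if not 1 <= n_stacks <= len(seeing):
        raise ValueError("n_stacks must be between 1 and the number of exposures")
    order = sorted(range(len(seeing)), key=lambda i: seeing[i])
    base, extra = divmod(len(order), n_stacks)
    groups, start = [], 0
    for g in range(n_stacks):
        size = base + (1 if g < extra else 0)   # spread any remainder over the first groups
        groups.append(order[start:start + size])
        start += size
    return groups


if __name__ == "__main__":
    seeing = [0.9, 0.6, 1.2, 0.7, 1.0, 0.8, 1.1]
    print(group_by_seeing(seeing, 3))
```

Because the groups are contiguous in seeing, the best-seeing exposures end up together in the first sub-stacks rather than being diluted across all of them.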

3.2. Effect on Recurring Principal Components 

The multi-exposure technique may also help the first two systematic errors somewhat. The equations
for these errors have terms with $\langle a_{ik} a_{j\ell} \rangle$ in them. With $i \ne j$, the two coefficients will often be uncorrelated,
so this reduces to $\langle a_{ik} \rangle \langle a_{j\ell} \rangle$. Then, if the coefficients for either component $k$ or $\ell$ have zero expectation
value, these terms will vanish as well. Even if the expectation values are not exactly zero, they may
often be much smaller than the rms, so the $i \ne j$ terms may still be much smaller than the autocorrelation
$i = j$ terms.

Not all coefficients will be uncorrelated between different exposures. For example, for ground-based 
telescopes, some components may correspond to telescope flexure when the telescope points in a particular 
direction. If a given location is always observed at similar hour angle, the coefficients for these components 
will be correlated. 



4. Discussion 

This paper has been concerned with the effect of PSF anisotropy patterns on systematic errors in weak
lensing surveys. We have suggested the use of galaxy shapes measured in distinct exposures to estimate
shear correlations as a way of eliminating the systematic error due to non-recurrent PSF patterns. Jarvis &
Jain (2004) showed that recurrent PSF patterns can be accurately measured using a principal component
approach. By using these two techniques in lensing pipelines, generic PSF patterns can be interpolated,
and the systematic errors they cause therefore corrected, to high accuracy.

^An absolute minimum of 3 exposures is required to use our multi-exposure trick for the three-point correlation functions.

In planning a large-area cosmic shear survey, we have shown that the key factors that enable accurate
PSF corrections are: sufficiently many well-measured stars in all parts of the sky; 5-10 exposures per
pointing; and sufficiently few important principal components, which cannot exceed the number of stars per exposure.
In addition, the principal components can be estimated better if dense stellar fields are imaged at regular
intervals, and if there are few changes in the instrument over the course of the survey (as these can introduce
new principal components).

Another consideration for minimizing the number of important principal components is to keep the 
observing conditions as stable as possible. For each underlying physical cause of PSF variation, one can 
essentially do a Taylor expansion of the PSF pattern with respect to that variable. The PCA will need a 
separate component for each term in the Taylor expansion which has a significant amplitude. Thus, one 
should try to keep such variations (e.g. focus error, component misalignments, mirror flexure, etc.) small
enough that one or two terms in the expansion are sufficient to adequately describe the effect on the PSF 
pattern. One can estimate what limits are sufficient through spot-diagram ray-tracing programs. 
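The link between the size of a physical variation and the number of components the PCA needs can be illustrated with a toy example. The sketch below assumes a single hypothetical instrumental parameter whose effect on the PSF pattern has a first-order term (linear across the field) and a second-order term (quadratic across the field), and counts the singular values that exceed a chosen fraction of the largest one. All function names, pattern shapes and thresholds are illustrative assumptions.

```python
import numpy as np


def toy_patterns(max_shift, n_exp=200, n_pos=50, seed=0):
    """PSF patterns for n_exp exposures sampled at n_pos field positions.

    Each exposure has a random parameter offset d in [-max_shift, max_shift];
    the pattern is d * f1(x) + d**2 * f2(x), i.e. the first two Taylor terms.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_pos)
    d = rng.uniform(-max_shift, max_shift, n_exp)
    return d[:, None] * x[None, :] + (d ** 2)[:, None] * (x ** 2)[None, :]


def count_significant_components(patterns, rel_tol=0.05):
    """Number of singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(patterns, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))


if __name__ == "__main__":
    for max_shift in (0.01, 0.5):
        n = count_significant_components(toy_patterns(max_shift))
        print(f"max_shift={max_shift}: {n} significant component(s)")
```

With a small range of the parameter only the first-order term matters and a single component suffices; with a large range the second-order term becomes a second significant component.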

The second goal of this paper was to provide a formalism to estimate residual systematics due to PSF 
errors. The ingredients needed to apply our formalism are an estimate of typical PSF power spectra and of 
the number of significant principal components of PSF patterns. For planned surveys, this is best
accomplished by generating PSF patterns in a given exposure by ray tracing through the telescope optics. Mock
surveys can then be generated by modeling the atmosphere and the variation of instrumental parameters 
over the course of the survey. The resulting models of PSF patterns can be used with the formalism of §2 to 
find telescope parameters and survey strategy that minimize residual systematics. The difficulty in getting 
reliable estimates of residual systematics will be in including all relevant factors which may affect the PSF, 
many of which may be subtle and hard to anticipate. But the benefit of such an exercise is the ability to
optimize instrument and survey parameters for lensing measurements. 

Further, once data is taken, comparison of the measured principal components with the models will 
help validate the error analysis. Our formalism can be applied to survey data to estimate residual systematic 
errors. If systematics turn out to be significant, empirical estimation allows one to incorporate them in the 
error budget for cosmological parameters. In addition, the following tests provide independent checks of the 
estimate of systematic errors from survey data (note that at least the latter two tests can be applied to model 
PSF patterns for planned surveys as well): 

• Stellar ellipticity correlations 

For analysis methods where the corrected stars are not degenerately round, the two- and three-point 
correlation functions of the corrected stars can be a measure of how well the interpolation is removing 
the systematic contributions to the correlation. For a better check, one can perform the PSF corrections 
with only half the stars, and look at the resulting correlations of the other half. 

• Cross-correlation of galaxies with foreground stars

This will provide a somewhat more direct measure of the contamination from the interpolation, since
the galaxies use the interpolated PSF. Again, one can split the stars in half for a better check.

• E/B mode analysis 

The shear field can be decomposed into curl-free (E) and divergence-free (B) modes (Schneider et al. 
1998; Crittenden et al. 2001). Most PSF effects have roughly equal power in the E and B-modes, 
while the lensing signal is (almost) only in the E-mode. So any residual PSF contamination should 
show up in the B-mode. However, there are some cosmological sources of B-mode power (Crittenden 
et al. 2001; Schneider et al. 2002), so when these become important, this check will only provide an 
upper limit to the contamination due to the PSF and other systematic effects (e.g. Vale et al. 2004).

• Higher order correlations 

Higher order correlation functions, in particular the three-point function, provide some independent 
checks on systematic errors. The three-point function of the gravitational shear vanishes at lowest 
order in the density in the quasilinear regime, so it is non-zero only at fourth-order in the density, but 
it would likely have a third order contribution from systematic errors. That is, the relative contribution 
of systematics could be higher than it is in the two-point function. Further, there are multiple three- 
point functions that contain B-mode contributions, which would in general behave differently from 
the two-point functions. The three-point function also has a shape dependence that should reveal its 
gravitational origin and depends somewhat differently on cosmological parameters than the two-point 
function (Bernardeau et al. 1997; Takada & Jain 2004). 
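The split-half version of the stellar ellipticity test listed above can be sketched as follows. For illustration only, a low-order polynomial fit stands in for the PSF interpolation scheme, the stellar shape is a single real ellipticity component, and the correlation function is a brute-force average over pairs; none of these choices is prescribed by the method, and all function names are hypothetical.

```python
import numpy as np


def _design(x, y, degree):
    """Polynomial design matrix with all terms x**i * y**j, i + j <= degree."""
    return np.column_stack([x ** i * y ** j
                            for i in range(degree + 1)
                            for j in range(degree + 1 - i)])


def split_half_residuals(x, y, e, degree=1, seed=0):
    """Fit the PSF model to a random half of the stars; test it on the rest.

    Returns the indices of the held-out stars and their residual shapes.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    fit, held = idx[: len(x) // 2], idx[len(x) // 2:]
    coeffs, *_ = np.linalg.lstsq(_design(x[fit], y[fit], degree), e[fit], rcond=None)
    return held, e[held] - _design(x[held], y[held], degree) @ coeffs


def pair_correlation(x, y, r, edges):
    """Mean of r_a * r_b over distinct pairs, binned by separation."""
    sep = np.hypot(x[:, None] - x[None, :], y[:, None] - y[None, :])
    upper = np.triu_indices(len(x), k=1)          # each pair counted once
    products = (r[:, None] * r[None, :])[upper]
    which = np.digitize(sep[upper], edges) - 1
    xi = np.full(len(edges) - 1, np.nan)          # NaN marks empty bins
    for b in range(len(edges) - 1):
        in_bin = which == b
        if in_bin.any():
            xi[b] = products[in_bin].mean()
    return xi


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    x, y = rng.uniform(0, 1, 400), rng.uniform(0, 1, 400)
    e = 0.01 + 0.02 * x - 0.03 * y + 0.005 * np.sin(6 * x)   # last term is not captured by the fit
    held, res = split_half_residuals(x, y, e)
    print(pair_correlation(x[held], y[held], res, np.array([0.0, 0.1, 0.3, 1.5])))
```

A residual correlation that is significantly non-zero in the held-out stars indicates PSF structure that the interpolation is not capturing.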

Finally, we note that the methods described here would reduce systematic errors due to the correction for
the size of the PSF in addition to those due to the anisotropic PSF described above. A round PSF smooths
the images of the galaxies, making them appear less elliptical. A multiplicative "dilution correction" (part
of the shear polarizability in the KSB formalism) is therefore needed to obtain the pre-seeing shape of
the galaxies. PCA interpolation provides better estimates of the size of the PSF, which is needed for this
correction. And the multi-exposure trick would eliminate the contribution of dilution errors due to
non-recurrent patterns in the shear correlation functions. A detailed study of the resulting improvements is left
for further work.

We thank David Rusin, Peter Schneider, Fritz Stabenau and David Wittman for help and comments, 
and an anonymous referee for suggestions. This work is supported in part by NASA grant NAG5-10924 and
by a Keck foundation grant. 